ABSTRACT



The goal of this paper is to study the possibility of using the first, second and third
most massive stars in open clusters to estimate total cluster mass and membership.
We built estimator functions with the use of numerical simulations and analytical
approximations and studied the precision and error distribution of the obtained
estimator functions. We found that the distributions of the masses of the first, second
and third most massive stars show strong power-law tails at the high-mass end, thus
it is better to use median or mode values instead of average ones. We show that
the third most massive star is a much better estimator than the first, as it is more
precise and less dependent on parameters such as the maximum allowed stellar mass.



1. Introduction 

Unfortunately, in most cases there does not exist a large body of statistics covering star
cluster membership, so estimating the cluster mass and the full number of members is
not an easy task. In some cases dynamical mass estimates have been applied when spectroscopic
data are available (making use of the virial theorem, see Kouwenhoven & de Grijs),
although this method can be imprecise (see Fleck et al. (2006)). Another method measures
the total brightness of identified members and extrapolates it using some luminosity function
(see Bonatto and Bica (2005)). However, it is often the case that only a few of the brightest
stars are reliably identified as members (Kharchenko et al. 2003), which provides only a
tiny amount of information about the cluster.



It is natural to expect to find more massive stars in more massive clusters. However,
assuming a Salpeter (1955) initial mass function (hereafter IMF), one would expect several
stars with masses $M_{*} \gtrsim 300\,M_\odot$ in our Galaxy, which is not the case. This controversy was
discussed by Elmegreen (2000), who presented a relation between the cluster mass $M_{cl}$ and
the mass $m_1$ of its most massive star (assuming random sampling from the Salpeter IMF):

$$ M_{cl} \sim 3 \times 10^{3} \left( \frac{m_{1}}{100\,M_\odot} \right)^{1.35} M_\odot . \qquad (1) $$

Elmegreen tried to introduce an exponential cut-off function to explain the absence of
heavy stars, while maintaining randomness of sampling from the IMF. However, this led
to a contradiction with the observed mass functions of massive clusters, as a Salpeter-like
power law can be traced up to at least $100\,M_\odot$. He proposed several explanations for this,
including dependence of the IMF on the initial cluster mass.



Kroupa's IMF (Kroupa (2001)) has become popular over the past decade as a standard
cluster IMF. It is built from several power-law parts:

$$ f(m)\,dm = C m^{-\alpha}\,dm, \qquad F(m) = \int_{0.01\,M_\odot}^{m} f(m')\,dm' \qquad (2) $$

with parameters (Kroupa (2001)):

$$
\begin{array}{ll}
\alpha_0 = +0.30, & 0.01 < m/M_\odot < 0.08, \\
\alpha_1 = +1.30, & 0.08 < m/M_\odot < 0.50, \\
\alpha_2 = +2.35, & 0.50 < m/M_\odot < m_{max}.
\end{array}
\qquad (3)
$$

From Equation (3) we see that a sharp cut-off at the high-mass end was introduced,
although the exact value of $m_{max}$ is left as a free parameter. Using this version of the IMF,



Weidner & Kroupa (2006) reviewed the correlation between the mass of the most massive
star in the cluster and the total cluster mass. They concluded that random sampling
contradicts observations (for a description of the various samplings see Section 2).
Maschberger & Clarke (2008) discussed the correlation between the number
of stars in the cluster and the mass of its most massive member, and found it compatible with
random sampling. However, they only looked at a small range in cluster masses. These
(and many more) papers were recently reviewed in Weidner, Kroupa & Bonnell (2010).

Faustini et al. (2009) recently studied stellar clusters around young high-mass stars.
They used Monte-Carlo simulations to try to reproduce the properties of observed clusters.
Among other results, they found that the distributions of the total cluster mass and of the
mass of the most massive star were skewed, and suggested using the mode of the distribution
instead of its mean or median values.

In this paper we try to produce cluster mass ($M_{cl}$) and membership ($N$) estimators
using the masses of the three most massive members, and analyse the precision of these estimators.
We concentrate on two questions: 

• what data provides the most reliable information on the cluster properties? 

• what is the best method to extract cluster properties from that data? 



Having these two goals in mind means we will neglect at least three very important factors:
stellar binarity, stellar evolution and cluster dynamics. Stellar binarity can show itself
directly by altering the IMF (if the process of binary star formation differs from that of a
single star) or indirectly through unresolved binaries, which alter the luminosity function and our
assumptions about the IMF. This problem was studied by Weidner, Kroupa & Maschberger (2009).
Stellar evolution is important for old clusters, as their heavy stars can evolve and
turn into stellar remnants or lose a fraction of their initial mass through stellar
winds. Modelling this process requires a set of assumptions and is beyond the scope of
this work. Note that Weidner, Kroupa & Bonnell (2010) studied young clusters, which
allowed them to neglect stellar evolution. Cluster dynamics leads to mass segregation and
evaporation of the cluster with time. Although it is believed that the lighter stars are
the most probable candidates to be ejected from the cluster, this can also happen to the
heavier stars. For more details on this problem see Fleck et al. (2006) and
Pflamm-Altenburg & Kroupa (2006). However, this effect is less important for young clusters.



Of course, it is not possible to weigh stars directly, but their masses can be estimated,
mostly from their measured brightness. However, this process is very uncertain.
The other way to estimate stellar masses is to weigh binary members, if the orbit is
known, but this raises the problem of the influence of binarity on the IMF mentioned above.
But, again, as we are interested in the statistical side of the problem, we will postpone
astrophysical difficulties for later research.

If we assume all stars in the cluster to be single, then the (initial) mass of each star
depends only on the initial mass function. Here we will consider only one IMF, that of
Kroupa (2001) (see Eq. 3). The value of $m_{max}$ is still a matter of debate, therefore several
values will be considered in this paper (50, 150 and $300\,M_\odot$), with the main focus being put
on $m_{max} = 150\,M_\odot$. Although $m_{max} = 50\,M_\odot$ is not a realistic value, it is used here to study
the dependence of the results on $m_{max}$. We will try to see how $m_{max}$ influences the mass
estimator precision.

The following notation will be used throughout the paper: $\bar{m}_1$ and $\tilde{m}_1$ denote the average
and the median value of $m_1$, respectively (the same for $m_2$ and $m_3$). Another useful value
is the position of the peak of the $m_{1,2,3}$ distribution for a given $M_{cl}$ or $N$ (the mode of the
distribution), which we designate as $\hat{m}_{1,2,3}$. The Kroupa IMF (see Equation 3) has an average
stellar mass $\bar{m} = 0.36\,M_\odot$ for $m_{max} = 150\,M_\odot$.
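The quoted average stellar mass can be cross-checked numerically. The sketch below is not taken from the paper: it assumes that the three power-law segments of Equation (3) join continuously at the break masses and that the IMF is normalised to unit probability over $0.01$ to $150\,M_\odot$. The function names (`power_integral`, `imf_coefficients`, `mean_mass`) are illustrative only.

```python
import math

# Segment boundaries (solar masses) and slopes as quoted in Equation (3).
# The last edge is the assumed upper limit m_max = 150.
EDGES = [0.01, 0.08, 0.50, 150.0]
ALPHAS = [0.30, 1.30, 2.35]


def power_integral(lo, hi, p):
    """Integral of m**p dm from lo to hi."""
    if abs(p + 1.0) < 1e-12:
        return math.log(hi / lo)
    return (hi ** (p + 1.0) - lo ** (p + 1.0)) / (p + 1.0)


def imf_coefficients(edges=EDGES, alphas=ALPHAS):
    """Coefficient C_k of every segment (assumption: continuity at the breaks)."""
    coeffs = [1.0]
    for k in range(1, len(alphas)):
        # Continuity at edges[k]: C_{k-1} m^-a_{k-1} = C_k m^-a_k
        coeffs.append(coeffs[-1] * edges[k] ** (alphas[k] - alphas[k - 1]))
    norm = sum(c * power_integral(edges[k], edges[k + 1], -a)
               for k, (c, a) in enumerate(zip(coeffs, alphas)))
    return [c / norm for c in coeffs]


def mean_mass(edges=EDGES, alphas=ALPHAS):
    """Average stellar mass: integral of m f(m) dm over the full mass range."""
    coeffs = imf_coefficients(edges, alphas)
    return sum(c * power_integral(edges[k], edges[k + 1], 1.0 - a)
               for k, (c, a) in enumerate(zip(coeffs, alphas)))


if __name__ == "__main__":
    print("high-mass normalisation C:", imf_coefficients()[-1])
    print("average stellar mass     :", mean_mass())
```

Under these assumptions the average mass comes out close to the $0.36\,M_\odot$ quoted above, and the last coefficient plays the role of the constant $C$ for the high-mass segment.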



2. Model 



Following Weidner & Kroupa (2006), we used three different methods for generating
cluster members:

Random sampling: $N$ stars are taken randomly from the IMF, with $N$ ranging from
300 to 10000.

Constrained sampling: $M_{cl}$ is fixed, then stars are taken from the IMF until their
total mass surpasses $M_{cl}$. Thus some spread in $N$ is expected in this sample.

Sorted sampling: $M_{cl}$ is also fixed, then $N' = M_{cl}/\bar{m}$ stars are taken from the IMF. If
$M' = \sum_{N'} m_i$ is smaller than $M_{cl}$, then $\Delta N = (M_{cl} - M')/\bar{m}$ stars are added to the
cluster, giving a new $N'$ and $M'$, and the procedure is repeated until the cluster mass $M'$
surpasses $M_{cl}$. After this the stellar masses are sorted. If $|M' - M_{cl}|$ is larger than
$|M_{cl} - (M' - m_1)|$ then the heaviest star is removed from the set.



According to Weidner & Kroupa (2006), random sampling is the least realistic model,
but the easiest to model and describe analytically.
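The three sampling schemes described above can be sketched as follows. This is an illustrative reading of the text, not the authors' code: stars are drawn by inverse-transform sampling from a three-segment IMF that is assumed to be continuous at the break masses, the average mass of 0.36 is taken from the text, and the `max(1, ...)` guards (which guarantee that at least one star is added per step) are an added assumption.

```python
import bisect
import random

EDGES = [0.01, 0.08, 0.50, 150.0]   # solar masses; last edge is m_max
ALPHAS = [0.30, 1.30, 2.35]


def _segment_weights():
    """Relative probability mass of each IMF segment (continuity assumed)."""
    coeffs = [1.0]
    for k in range(1, len(ALPHAS)):
        coeffs.append(coeffs[-1] * EDGES[k] ** (ALPHAS[k] - ALPHAS[k - 1]))
    return [c * (EDGES[k + 1] ** (1 - a) - EDGES[k] ** (1 - a)) / (1 - a)
            for k, (c, a) in enumerate(zip(coeffs, ALPHAS))]


_W = _segment_weights()
_CUM = [sum(_W[:k + 1]) / sum(_W) for k in range(len(_W))]


def draw_star(rng):
    """Draw one stellar mass: pick a segment, then invert its power-law CDF."""
    k = min(bisect.bisect_left(_CUM, rng.random()), len(_CUM) - 1)
    lo, hi, p = EDGES[k], EDGES[k + 1], 1.0 - ALPHAS[k]
    return (lo ** p + rng.random() * (hi ** p - lo ** p)) ** (1.0 / p)


def random_sampling(n_stars, rng):
    """N stars drawn independently; returned heaviest first."""
    return sorted((draw_star(rng) for _ in range(n_stars)), reverse=True)


def constrained_sampling(m_cl, rng):
    """Draw stars until the total mass surpasses m_cl."""
    stars, total = [], 0.0
    while total <= m_cl:
        stars.append(draw_star(rng))
        total += stars[-1]
    return sorted(stars, reverse=True)


def sorted_sampling(m_cl, rng, mean_mass=0.36):
    """Draw in batches of (missing mass / mean mass) stars, then sort and trim."""
    stars = [draw_star(rng) for _ in range(max(1, round(m_cl / mean_mass)))]
    total = sum(stars)
    while total < m_cl:
        extra = max(1, round((m_cl - total) / mean_mass))
        batch = [draw_star(rng) for _ in range(extra)]
        stars.extend(batch)
        total += sum(batch)
    stars.sort(reverse=True)
    # Drop the heaviest star if that brings the total closer to m_cl.
    if abs(total - m_cl) > abs(m_cl - (total - stars[0])):
        stars.pop(0)
    return stars


if __name__ == "__main__":
    rng = random.Random(42)
    for name, cluster in [("random", random_sampling(1000, rng)),
                          ("constrained", constrained_sampling(300.0, rng)),
                          ("sorted", sorted_sampling(300.0, rng))]:
        print(name, len(cluster), round(sum(cluster), 1), cluster[:3])
```

Each function returns the stellar masses in descending order, so the first three entries are $m_1$, $m_2$ and $m_3$.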

For each set of parameters ($M_{cl}$ or $N$, sampling, $m_{max}$) 30000 clusters were simulated,
and for each one five values were saved: the cluster mass $M_{cl}$, the number of stars in the cluster $N$,
and the masses of the three most massive stars of the cluster, $m_1$, $m_2$ and $m_3$.

The goal is to build a method to find $M_{cl}$ and/or $N$ when $m_1$, $m_2$ and $m_3$ are known. It
seems natural to find functions $M_{cl}(\bar{m}_{1,2,3})$, $M_{cl}(\tilde{m}_{1,2,3})$ and $M_{cl}(\hat{m}_{1,2,3})$ (as well as $N(\bar{m}_{1,2,3})$,
$N(\tilde{m}_{1,2,3})$ and $N(\hat{m}_{1,2,3})$). From here on they will be called mass estimators (ME): average
ME, median ME and mode ME.



3. Analytics 



The probability for the most massive star to have mass $m_1 \in (m, m + dm)$ can
be written (Arnold et al. (1992)) as the probability for a given star to have mass in
$(m, m + dm)$ multiplied by the probability that all other stars have masses below $m$ and by
the number of stars $N$ (because any star can be the most massive one):

$$ P(m_1 \in (m, m + dm)) = N f(m) [F(m)]^{N-1} = N f(m) \left[ 1 - \int_{m}^{m_{max}} f(m')\,dm' \right]^{N-1} \qquad (4) $$

Of course, $m_1$ should be smaller than $m_{max}$, otherwise $P = 0$.



We can confidently use the part of the Kroupa IMF (see Equation 3) for $m > 0.5\,M_\odot$, as
the most massive stars are usually much heavier than $0.5\,M_\odot$. Substituting (2) into (4) and
integrating we get (for $\alpha \neq 1$):

$$ P(m_1 \in (m, m + dm)) = N C m^{-\alpha} \left[ 1 - \frac{C}{1 - \alpha} \left( m_{max}^{1-\alpha} - m^{1-\alpha} \right) \right]^{N-1} \qquad (5) $$

where $C$ is a normalisation constant.

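If $N$ is large, then we can use an exponent instead of the square brackets (and replace $N - 1$
by $N$ for the sake of simplicity):

$$ P(m_1 \in (m, m + dm)) \simeq N C m^{-\alpha} \exp \left( - \frac{N C}{1 - \alpha} \left( m_{max}^{1-\alpha} - m^{1-\alpha} \right) \right) \qquad (6) $$

The maximum of this distribution (the mode of the distribution) is located at the point

$$ \hat{m}_1 = \left( \frac{N C}{\alpha} \right)^{\frac{1}{\alpha - 1}} . \qquad (7) $$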
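The position of the maximum follows from a short calculation, written out here as a worked step. It starts from the large-$N$ approximation $P \simeq N C m^{-\alpha} \exp\left(-\frac{NC}{1-\alpha}(m_{max}^{1-\alpha} - m^{1-\alpha})\right)$ and sets the derivative of $\ln P$ to zero; the symbol $\hat{m}_1$ for the mode is the notation assumed in this reconstruction.

```latex
\ln P(m) = \ln (N C) - \alpha \ln m
           - \frac{N C}{1-\alpha}\left( m_{max}^{1-\alpha} - m^{1-\alpha} \right),
\\[4pt]
\frac{d \ln P}{d m} = -\frac{\alpha}{m} + N C\, m^{-\alpha} = 0
\;\Longrightarrow\;
\hat{m}_1^{\,\alpha-1} = \frac{N C}{\alpha}
\;\Longrightarrow\;
\hat{m}_1 = \left( \frac{N C}{\alpha} \right)^{\frac{1}{\alpha-1}} .
```

Note that $m_{max}$ drops out of the derivative, so it enters the mode only through the normalisation constant $C$.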



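If Equation (7) gives $\hat{m}_1 > m_{max}$, the maximum is obviously at the point $\hat{m}_1 = m_{max}$. This puts an
upper limit on the cluster mass that can be estimated with this formula. For $m_{max} = 150\,M_\odot$ this is $N \approx 26000$
and thus $M_{cl} = \bar{m} N = 9500\,M_\odot$. By inverting this equation we can get an estimate for
$N$ and $M_{cl}$ from $\hat{m}_1$:

$$ N = \frac{\alpha}{C}\, \hat{m}_1^{\alpha - 1}, \qquad M_{cl} = \bar{m} N = \frac{\alpha \bar{m}}{C}\, \hat{m}_1^{\alpha - 1} \qquad (8) $$

Note that $m_{max}$ is hidden within the constant $C$ in these equations, although the
dependence is weak.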
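As a numerical illustration of the mode relation and its inverse, the sketch below evaluates both directions. The value `C = 0.0772` is not given in the text: it is an assumed normalisation of the high-mass segment, obtained for a continuous three-segment IMF with $m_{max} = 150\,M_\odot$. The function names are hypothetical.

```python
ALPHA = 2.35       # high-mass slope from Equation (3)
C = 0.0772         # assumed normalisation constant (see lead-in), m_max = 150
MEAN_MASS = 0.36   # average stellar mass quoted in the text, solar masses
M_MAX = 150.0      # upper stellar mass limit, solar masses


def mode_m1(n_stars, c=C, alpha=ALPHA, m_max=M_MAX):
    """Mode of the m_1 distribution for a cluster of n_stars, capped at m_max."""
    return min((n_stars * c / alpha) ** (1.0 / (alpha - 1.0)), m_max)


def n_from_mode(m1_mode, c=C, alpha=ALPHA):
    """Invert the mode relation: number of stars for a given mode of m_1."""
    return alpha / c * m1_mode ** (alpha - 1.0)


def mcl_from_mode(m1_mode, c=C, alpha=ALPHA, mean_mass=MEAN_MASS):
    """Cluster mass estimate: average stellar mass times the estimated N."""
    return mean_mass * n_from_mode(m1_mode, c, alpha)


if __name__ == "__main__":
    print("mode of m_1 for N = 1000:", mode_m1(1000))
    print("N at the cap m_1 = m_max:", n_from_mode(M_MAX))
    print("M_cl at the cap         :", mcl_from_mode(M_MAX))
```

With this assumed $C$, the values at the cap land near the limits quoted above ($N$ of roughly 26000 and $M_{cl}$ of roughly $9500\,M_\odot$).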

For the $n$-th most massive star, if $n \ll N$ we can use the expression:

$$ P(m_n \in (m, m + dm)) \simeq (1 - F(m))^{n-1}\, P(m_1 \in (m, m + dm)) \qquad (9) $$

Finding average and median values for Equation (6) is not that easy.

Building analytical expressions for the other sampling methods is a much more
complicated task and is not discussed here.

4. Results

4.1. Random sampling 

The random sampling model has a natural parameter: the number of stars in the
cluster, $N$. Here $N$ ranges from 300 to 10000, with 30000 clusters being simulated for each
value of $N$.

For each value of $N$ the distributions of $m_1$, $m_2$ and $m_3$ were calculated. An example of
these distributions is shown in Figure 1. From the figure it can be seen that the theoretical
estimates given by Equations (6) and (9) match the data well. Note the long power-law tails
of the distributions, especially for $m_1$. This tail leads to significant differences between the
average and the median values, making the average much higher. Thus, averages are not
well suited to building cluster mass estimators.
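The gap between the average and the median of $m_1$ can be illustrated with a small experiment. The sketch below does not simulate whole clusters as the paper does. As a shortcut it draws $m_1$ directly from the distribution of the maximum, using $F(m_1)^N = u$ with $u$ uniform, and inverts only the high-mass segment of the IMF. The constant `C = 0.0772` is an assumed normalisation for $m_{max} = 150\,M_\odot$, and the function names are illustrative.

```python
import random
import statistics

ALPHA = 2.35     # high-mass slope from Equation (3)
C = 0.0772       # assumed normalisation of the high-mass segment
M_MAX = 150.0    # upper stellar mass limit, solar masses


def draw_m1(n_stars, rng):
    """Most massive star of an n_stars cluster, drawn via F(m_1) = u**(1/N).

    Only the high-mass segment is inverted, which is adequate when
    u**(1/N) is close to 1 (true for N of a few hundred or more).
    """
    f = rng.random() ** (1.0 / n_stars)
    # 1 - F(m) = C / (alpha - 1) * (m**(1 - alpha) - m_max**(1 - alpha))
    tail = (1.0 - f) * (ALPHA - 1.0) / C + M_MAX ** (1.0 - ALPHA)
    return tail ** (1.0 / (1.0 - ALPHA))


def summarize(n_stars=1000, n_clusters=20000, seed=0):
    """Return (mean, median) of m_1 over n_clusters synthetic clusters."""
    rng = random.Random(seed)
    sample = [draw_m1(n_stars, rng) for _ in range(n_clusters)]
    return statistics.mean(sample), statistics.median(sample)


if __name__ == "__main__":
    mean_m1, median_m1 = summarize()
    print("mean m_1  :", mean_m1)
    print("median m_1:", median_m1)
```

Because the high-mass tail pulls the mean upward, the mean of the sample exceeds its median, which is the behaviour described above.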

Now the task is to build a method to find $M_{cl}$ and/or $N$, knowing $m_1$, $m_2$ and $m_3$. We
will try to find functions $M_{cl}(\bar{m}_{1,2,3})$, $M_{cl}(\tilde{m}_{1,2,3})$ and $M_{cl}(\hat{m}_{1,2,3})$. These functions for $M_{cl}$
are shown in Figure 2. They can be approximated with functions of the shape:

$$ M_{cl}(m_{1,2,3}) = a\, m_{1,2,3}^{\,b}\, (m_{max} - m_{1,2,3})^{c} \qquad (10) $$

so those functions rise as power laws for small $m$ and then saturate as $m$ goes to $150\,M_\odot$.

The parameters of the fits are shown in Table 1. The first column refers to one of the
functions from Equation (10), and the following columns give the
parameters applied to this function: $a$, $b$ and $c$, respectively. For the mode estimator $N(\hat{m}_{1,2,3})$, Equation
(10) can be used by setting $c = 0$ and $b = 1.35$. For the other estimators $b$ is always close to
$\alpha_2 - 1 = 1.35$ and $c$ is close to $-1$, although $f(m) = a m^{1.35} (150 - m)^{-1}$ is a bad fit. The
magnitude of $c$ decreases from $f(m_1)$ to $f(m_3)$. This is caused by the fact that the values of
$m_3$ are much smaller than $m_1$ and are well separated from $m_{max}$, therefore they are less
affected by saturation. Thus, $f(m_3)$ is less sensitive to the value of $m_{max}$. It should also be
mentioned that $a$, $b$ and $c$ are highly correlated, so the values given in Table 1 might not be
the only ones giving good fits.

Fig. 1. Distribution of $m_1$ (plus signs), $m_2$ (crosses) and $m_3$ (stars) with theoretical
estimates from Equations (6) and (9) (long-dashed, short-dashed and dot-dashed, respectively)
for $N = 1000$.
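To make the fitted shape concrete, the sketch below evaluates $a\,m^{b}(m_{max} - m)^{c}$ with the cluster-mass parameters listed in Table 1. Several points are assumptions rather than statements from the text: that these parameters refer to random sampling with $m_{max} = 150\,M_\odot$, that masses are in solar units, and that the first table row belongs to $m_1$. The dictionary and function names are hypothetical.

```python
M_MAX = 150.0  # assumed upper stellar mass limit for the Table 1 fits

# (a, b, c) for the cluster-mass estimators, keyed by (estimator kind, star rank).
# Values copied from Table 1; the mode fits have c = 0 (pure power law).
TABLE1_MCL = {
    ("average", 1): (504.90, 1.55, -1.23),
    ("average", 2): (2492.00, 1.35, -1.17),
    ("average", 3): (4106.74, 1.31, -1.13),
    ("median", 1): (512.24, 1.30, -0.94),
    ("median", 2): (623.04, 1.34, -0.82),
    ("median", 3): (865.90, 1.34, -0.79),
    ("mode", 1): (30.09, 1.35, 0.00),
    ("mode", 2): (42.32, 1.40, 0.00),
    ("mode", 3): (60.48, 1.38, 0.00),
}


def estimate_mcl(m, kind, rank, m_max=M_MAX):
    """Evaluate a * m**b * (m_max - m)**c for the chosen estimator.

    m    : mass of the rank-th most massive star (solar masses)
    kind : "average", "median" or "mode"
    rank : 1, 2 or 3
    """
    if not 0.0 < m < m_max:
        raise ValueError("m must lie strictly between 0 and m_max")
    a, b, c = TABLE1_MCL[(kind, rank)]
    return a * m ** b * (m_max - m) ** c


if __name__ == "__main__":
    for kind in ("average", "median", "mode"):
        print(kind, [round(estimate_mcl(10.0, kind, r), 1) for r in (1, 2, 3)])
```

The guard on `m` reflects that, for negative $c$, the factor $(m_{max} - m)^{c}$ diverges at $m = m_{max}$ and is undefined beyond it.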

Given these approximations, we return to the initially simulated data to test how
good they are. Namely, we substitute $m_{1,2,3}$ for each cluster into the mass estimators (see
Eq. 10) to get $M_{cl}(m_i)$ and $N(m_i)$, which can then be compared to the real values. This
produces distributions of estimated $M_{cl}(m_i)$ and $N(m_i)$. Errors of the estimation
can be calculated as $|M_{cl}(m_i) - M_{cl}|$ and $|N(m_i) - N|$. A sample of the result for $N(m_i)$
is shown in Figure 3 for a cluster with a pre-defined number of stars ($N = 1000$). Note
that there is a large power-law tail on the high side, where the estimated $N$ reaches values far above the true one; this tail is
highest for $N(m_1)$ and smallest for $N(m_3)$ in all cases. Generally $N(m_3)$ shows a smaller
spread than the other estimators. The distribution of $N(m_3)$ also peaks closer to the real value
$N = 1000$.

Tables 2 and 3 summarise the relative errors of the mean and the relative dispersions for
various estimators and samplings. The average estimator is the worst one, giving the
highest error of the average value in almost all cases (sometimes up to 23%), with high
dispersion. The best one seems to be the median estimator, with errors of less than 2%.
The mode estimator is even worse than the average estimator for random sampling (75%
error), but it is better for the sorted one, which is a more realistic sampling. As expected,
the result is due to the power-law tail of the distributions, to which the median (and mode)
values are less sensitive. There is a high probability for $m_1$ to be close to $m_{max}$, where the
estimator functions (see Eq. 10) are very sensitive to $m_i$, thus producing a higher error and
extremely large dispersions. The mode estimator is free from this effect by definition, as
it has no $(m_{max} - m)^{c}$ factor.

The power-law tails of the distributions also cause extremely high dispersions for the
estimates based on $m_1$. Dispersions (and errors of the mean) are much smaller for estimators
based on $m_2$ and $m_3$, as the slope of the power-law tail is significantly higher. Relative
dispersions are smallest for the mode estimator, but it has a higher relative error for the
mean when compared with the median estimator. Note that in most cases estimators based
on $m_3$ show the best results both in terms of the error of the mean and the dispersions.

Fig. 2. Estimators data: dependencies of $M_{cl}$ on $m_1$ (left), $m_2$ (middle) and $m_3$ (right) for
random (solid lines), constrained (long-dashed) and sorted (short-dashed) samplings. Top
row is for average values, middle is for medians and bottom for the mode.

Here we emphasise once again, that errors are distributed in a significantly non- 
Gaussian way in this problem. Using median values minimises the error for a high 
proportion of the data, while for a smaller proportion the errors remain large. 

4.2. Constrained sampling 

We applied almost the same algorithm as in the random sampling case to the
constrained sampling case. The only change was that we did not have an analytical formula
for $\hat{m}_{1,2,3}$, and therefore had to use fits to the simulated data of the shape $f(\hat{m}) = a\hat{m}^{b}$.

In Figure 4 one can see that the difference between random and constrained samplings
is not very large in most cases. The distribution for constrained sampling rises and falls
faster than the one for random sampling. The faster decrease at the high end of the distribution
for the small cluster (Figure 4, top panel) arises because, once the
total mass comes close to the desired $M_{cl}$ during the simulation, massive stars are preferentially rejected from the
sample when adding them would make the cluster too massive. Obviously, this effect vanishes
for higher $M_{cl}$, as one can see from the bottom panel of Figure 4.
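A fit of the shape $f(m) = a m^{b}$ can be obtained in several ways, and the text does not say which procedure was used. The sketch below shows one common choice, stated here as an assumption: ordinary least squares on $\ln f$ versus $\ln m$. The function name `fit_power_law` and the sample points are illustrative only.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**b in log-log space; returns (a, b)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    b = sxy / sxx                        # slope in log-log space
    a = math.exp(mean_y - b * mean_x)    # intercept transformed back
    return a, b


if __name__ == "__main__":
    # Synthetic points lying exactly on a power law, used only as a self-check.
    masses = [3.0, 10.0, 30.0, 100.0]
    values = [30.0 * m ** 1.35 for m in masses]
    print(fit_power_law(masses, values))
```

Fitting in log space weights relative rather than absolute deviations, which is usually preferable when the data span several orders of magnitude, as cluster masses do here.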

4.3. Sorted sampling 



Sorted sampling should suppress the probability of high-mass star formation even more
than constrained sampling. This can be seen in Figure 5: the distribution of $m_1$ for sorted
sampling is almost like that of $m_2$ for random sampling at the high-mass end.

Fig. 3. Distribution of estimates for the number of stars in the cluster $N(m_{1,2,3})$ (solid,
long-dashed and short-dashed lines, respectively) for a cluster with pre-defined 1000 stars.
Estimators are based on average (upper left), median (upper right) and mode (bottom)
values.

Fig. 4. Normalised distribution of $m_1$ (solid), $m_2$ (long-dashed) and $m_3$ (short-dashed) for
constrained sampling (thick lines) and for random sampling (thin lines, same as Figure 3).
Top panel: $N = 1000$; $M_{cl} = 300\,M_\odot$. Bottom panel: $N = 4500$; $M_{cl} = 1500\,M_\odot$.





Fig. 5. Normalised distribution of $m_1$ (solid), $m_2$ (long-dashed) and $m_3$ (short-dashed) for
sorted sampling (thick lines) and for random sampling (thin lines, same as in Figure 3). Top
panel: $N = 1000$; $M_{cl} = 300\,M_\odot$. Bottom panel: $N = 4500$; $M_{cl} = 1500\,M_\odot$.



4.4. Estimator reliability 



We first attempt to compare the various estimators by comparing the data on which
they are constructed. Let us return to Figure 2. It is obvious that the curves are very close
to each other. This can also be seen from the similarity of the fit parameters $a$, $b$, $c$ (see Eq.
10 and Table 1). So one might expect that predictions made with different estimators for
the same value of $m$ will not differ from each other significantly. This difference can be even
smaller than the difference between the estimated and real value, as predictions can deviate
from the real value in the same direction. We calculate the average difference $\Delta f(m_i)$
between estimators (Equation 11), where $i = 1, 2, 3$ and $m_i$ goes from 3 to $140\,M_\odot$. This is a measure of how far away the
estimators are from each other on average. The result is shown in Table 4. Note that in
most cases $\Delta f(m_3) < \Delta f(m_1)$. Constrained sampling is much closer to random sampling
than sorted sampling is (differences from 14 to 71%, compared to 21 to 92%). The reason for this is,
of course, that different samplings give different distributions that are used to produce
estimators. This can be seen in Figure 2 by the distance between the lines. Differences
remain large, on the order of 20%, which is much larger than the relative errors of the mean
value (see Table 2) and relative dispersions (see Table 3). This difference is less important
for estimators based on $m_1$, as the relative errors of the mean value and relative dispersions
are comparable to the differences between samplings. Due to this fact it is not efficient to
use statistics on the most massive stars for distinguishing between various samplings.

Thus it is crucial to know which sampling method is more realistic, although there is
still some discussion about it (see the Introduction). It is also important to notice that
current mass estimates, including those for $m_{max}$, can have errors as high as 50%.

Another check for the reliability of the obtained estimators is to try to apply them to
the "wrong" dataset: for example, using the median estimator from sorted sampling (see
Section 4.3) to estimate masses for the random one (see Section 4.1), or using an estimator
from a dataset with $m_{max} = 150\,M_\odot$ to estimate masses for the $m_{max} = 300\,M_\odot$ dataset,
etc. An example of this is shown in Table 5. Here $m_{max}$ was varied: a dataset with one
$m_{max}$ was used to build an estimator function (source dataset) that was then applied to the
dataset with another $m_{max}$ (target dataset). It should be noted that $m_{max}$ in the target
dataset cannot exceed that of the source dataset, as the function from Equation (10) would
be undefined. Diagonals in this table (i.e., values with equal source and target datasets)
are the same as columns 8 and 10 in Table 2. As expected, errors increase with increasing
difference between the $m_{max}$ of the source and target datasets. Estimators based on $m_3$ are better in almost
all cases, but the median estimator is as sensitive to $m_{max}$ variations as the average one,
showing errors of up to 75%. On the other hand, relative estimate dispersions decrease
rapidly with the $m_{max}$ difference, as the probability of having $m_1$ close to $m_{max}$, where one can
get large errors, becomes smaller. It is also important that values of $m_{max}$ smaller
than $150\,M_\odot$ are not realistic, and $m_{max} = 50\,M_\odot$ was introduced just to study the effect of
a large range of parameter values.

5. Conclusions 

Several estimators of cluster mass from the first, second and third most massive
stars were defined in this paper, and their precision was estimated. Estimators based on the
mass of the third most massive member, $m_3$, gave the best results (approximately 3-5 times better
than those based on $m_1$), and are less dependent on the maximum allowed stellar mass
$m_{max}$ and on the assumed way of star formation (the algorithm for picking masses from the IMF). We
found that it is also better to build estimators on the median or mode values of $m_i$ instead
of the average values. The reason is that the strong power-law tails in the $m_i$ distributions
make the average value a less representative parameter.

The most important parameter is the assumed algorithm describing how the cluster
mass is distributed among stars.

However, as several astrophysical effects were not taken into account, these results
cannot yet be applied to most real clusters. Inclusion of evolution into this model
is a subject for further work. Here it was shown that $m_3$ is a good candidate for building
mass estimators. Error analysis was also carried out and revealed a power-law tail in the
error distribution. We showed that the median (or mode) values are much better sources
for mass estimators than the average values.

Table 1: Parameters of fits (see Eq. 10)

             Fits based on average values    ...median values              ...mode values
Function     a          b      c             a         b      c            a        b      c
M_cl(m_1)    504.90     1.55   -1.23         512.24    1.30   -0.94        30.09    1.35   0.00
M_cl(m_2)    2492.00    1.35   -1.17         623.04    1.34   -0.82        42.32    1.40   0.00
M_cl(m_3)    4106.74    1.31   -1.13         865.90    1.34   -0.79        60.48    1.38   0.00
N(m_1)       1431.05    1.55   -1.23         1894.15   1.26   -0.97        83.58    1.35   0.00
N(m_2)       7091.91    1.35   -1.17         2659.81   1.30   -0.88        117.55   1.40   0.00
N(m_3)       11770.94   1.31   -1.14         4241.68   1.31   -0.89        168.00   1.38   0.00

Table 2: Relative error (in percent) of the mean value of estimated masses

         Random sampling            Constrained sampling       Sorted sampling
Value    f(m_1)   f(m_2)   f(m_3)   f(m_1)   f(m_2)   f(m_3)   f(m_1)   f(m_2)   f(m_3)
Median estimator
N        0.88     0.28     0.41     0.48     0.32     0.27     1.77     1.89     1.90
M        1.37     0.99     1.05     0.36     0.20     0.12     0.36     0.23     0.13
Average estimator
N        23.23    20.13    15.29    12.44    13.66    11.43    18.09    10.41    6.43
M        23.24    20.14    15.29    12.48    13.69    11.45    19.08    11.99    8.44
Mode estimator
N        74.97    43.85    25.87    19.44    15.66    12.99    7.41     6.35     4.17
M        74.82    43.73    25.76    21.27    9.29     7.79     7.92     7.63     4.67

Table 3: Relative dispersion (in percent) for mass estimations

         Random sampling            Constrained sampling       Sorted sampling
Value    f(m_1)   f(m_2)   f(m_3)   f(m_1)   f(m_2)   f(m_3)   f(m_1)   f(m_2)   f(m_3)
Median estimator
N        46.02    1.80     0.74     13.56    1.84     0.66     122.50   1.17     0.43
M        39.00    1.62     0.70     12.97    1.77     0.65     111.41   1.13     0.43
Average estimator
N        264.68   3.38     0.82     35.47    3.11     0.73     1296.57  1.99     0.46
M        259.82   3.34     0.81     34.34    3.04     0.73     547.78   1.00     0.36
Mode estimator
N        1.08     0.84     0.58     0.41     0.60     0.42     0.77     0.44     0.32
M        1.08     0.84     0.58     0.38     0.56     0.39     0.79     0.46     0.33

Table 4: Relative difference (in percent, see Eq. 11) between estimators for random and
other sampling

Sampling       Δf(m_1)   Δf(m_2)   Δf(m_3)
Average estimator
Constrained    24        17        18
Ordered        92        38        21
Median estimator
Constrained    20        14        14
Ordered        89        55        45
Mode estimator
Constrained    71        66        67
Ordered        34        45        47

Table 5: Relative error for estimators applied to a dataset different from the one the functions
were built on (sorted sampling only).

m_max               m_max (target dataset)
(source dataset)    Average ME                  Median ME
                    300      150      50        300      150      50
Estimators based on m_1
300                 26.37    51.03    81.12     2.10     34.47    74.66
150                 -        18.09    77.91     -        1.77     70.67
50                  -        -        16.53     -        -        2.36
Estimators based on m_3
300                 7.28     21.97    56.21     2.12     17.44    53.82
150                 -        6.43     51.76     -        1.90     49.36
50                  -        -        5.43      -        -        2.30